Introduction

Welcome to Practical 1, an introduction to using R and JAGS. In this practical we’ll load some data, explore the structure of a time series, and fit some simple models with JAGS.

You should follow and run the commands shown in the grey boxes. At various points you will see a horizontal line in the text which indicates a question you should try to answer, like this:


What words does the following command print to the console?

print("Hello World")

If you get stuck, please get our attention and we will try to help! There are no prepared answers to these questions so keep your own record as you go. At the end of the practical are harder questions which you can attempt if you get through all of the material. If you find any mistakes in the document please let us know.

You can run the code from these practicals by opening the .Rmd file in the same directory in RStudio. This is an R Markdown document containing all the text. Feel free to add in your own answers, or edit the text to give yourself extra notes. You can also run the code directly by highlighting the relevant code and clicking Run.

Getting started with R

R is an open-source, object-oriented language for doing statistics. All the data you load in, the internal functions and any functions you write are objects with attributes. We’ll get a feel for what this means as we go along.

If you need help with a function, you can type ?functionname into the command line, for example ?print
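As a quick sketch of what "everything is an object" means in practice (the object names here are arbitrary):

```r
# A simple numeric vector is an object
x = c(1.5, 2.5, 3.5)
class(x)   # its class is "numeric"
length(x)  # it has 3 elements

# A data frame is also an object, with attributes such as column names
df = data.frame(a = 1:3, b = c("x", "y", "z"))
class(df)       # "data.frame"
attributes(df)  # names, class and row.names
str(df)         # compact summary of the structure
```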

Loading some data


Use the read.csv() function [a variant of read.table()] to load up the data in file 'data/hadcrut.csv'

Use the str() and head() functions to interrogate the resulting R object. What class of object is it?

hadcrut = read.csv('../data/hadcrut.csv') # Adjust the path to wherever you saved the data
head(hadcrut)
##   Year Anomaly
## 1 1850  -0.412
## 2 1851  -0.237
## 3 1852  -0.262
## 4 1853  -0.335
## 5 1854  -0.258
## 6 1855  -0.255
with(hadcrut,plot(Year,Anomaly,type='l'))


You can plot the data using the plot() function. Is there a way to make the structure of the time series clearer? [Hint: look for the “type” option]

You can convert the Anomaly data to a time series object using the function ts() if you like. To keep things simple, you’ll only need the start, end and possibly frequency arguments.
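As a sketch of how ts() works, here it is applied to a made-up annual series (the values below are just the first few anomalies from the data; the appropriate start year for HadCRUT is 1850):

```r
# Toy annual series standing in for the anomaly data
anomaly = c(-0.412, -0.237, -0.262, -0.335, -0.258, -0.255)

# Convert to a time series object: annual data starting in 1850
anomaly_ts = ts(anomaly, start = 1850, frequency = 1)
start(anomaly_ts)      # first time point: 1850
frequency(anomaly_ts)  # 1 observation per year
plot(anomaly_ts)       # plot() now draws a line with years on the x-axis
```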


Autocorrelation in the time series

You can create a scatter plot in R using plot(x,y), where x and y are your variables.

Create a set of lagged scatter plots of the time series, to see if there is a hint of autocorrelation in the data set.

[Note: we need to remove the trend first]

hadcrut_smooth = lowess(hadcrut$Anomaly)$y
hadcrut_resid = hadcrut$Anomaly - hadcrut_smooth
plot(hadcrut_resid)

par(mfrow = c(2,2))
for(i in 1:4){
  n = length(hadcrut_resid)
  # The series against itself shifted back by i time steps
  t = hadcrut_resid[(i+1):n]
  t_minus_i = hadcrut_resid[1:(n-i)]
  plot(t, t_minus_i)
}


pacf(hadcrut$Anomaly)
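If you want a feel for what acf() and pacf() should show for a known process, here is a sketch on a simulated AR(1) series (nothing to do with the HadCRUT data): the acf decays geometrically, while the pacf cuts off after lag 1.

```r
set.seed(1)
# Simulate 500 observations from an AR(1) process with coefficient 0.7
ar1 = arima.sim(model = list(ar = 0.7), n = 500)

par(mfrow = c(1, 2))
acf(ar1)   # geometric decay across lags
pacf(ar1)  # one large spike at lag 1, then roughly zero
```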


BONUS: Moving window effects



Removing Seasonality

A quick introduction to JAGS

In this part of the practical, we will:

  1. Fit a linear model with JAGS to simulated and real data
  2. Fit a logistic regression with JAGS to simulated data

Estimate the coefficients of the linear model

\(y_{i} = \alpha + \beta x_{i} + \epsilon_{i}\)

Where \(\alpha\) is the intercept, \(\beta\) is the regression coefficient and \(\epsilon_{i}\) is Gaussian noise with variance \(\sigma^{2}\), independent across observations:

\[\epsilon_{i} \sim N(0, \sigma^{2})\]

# Likelihood:
# y[i] ~ N(alpha + beta * x[i], sigma^2)
# Priors - all vague
# alpha ~ N(0, 100)
# beta ~ N(0, 100)
# sigma ~ U(0, 10)

Simulate some data

Here is some R code that you can run to simulate data from the linear model. Plot the results.

T = 50
alpha = 2
beta = 3
#sigma = 1
sigma = 3
# Set the seed so this is repeatable
set.seed(123)
x = sort(runif(T, 0, 10)) # Sort as it makes the plotted lines neater
y = rnorm(T, mean = alpha + beta * x, sd = sigma)
plot(x, y)
lines(x, alpha + beta * x)

Input a JAGS model

We’d like to fit a model to the simulated data. This is the way that we specify the JAGS model code in R:

model_code = '
model
{
  # Likelihood
  for (t in 1:T) {
    y[t] ~ dnorm(alpha + beta * x[t], tau)
  }

  # Vague priors (uncomment to use these instead):
  # alpha ~ dnorm(0.0, 0.01)
  # beta ~ dnorm(0.0, 0.01)

  # Moderately informative priors (precision 3, i.e. sd of about 0.58):
  alpha ~ dnorm(0.0, 3)
  beta ~ dnorm(0.0, 3)
  tau <- 1/pow(sigma, 2) # JAGS parameterises the normal by precision, so convert from sigma
  sigma ~ dunif(0.0, 10.0)
}
'

We specify a likelihood and priors for each of the parameters. We put a prior on sigma, but JAGS parameterises the normal distribution by the precision (\(1/\sigma^{2}\)), so we need to compute that for the likelihood.
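As an example of the conversion, a dnorm(0.0, 0.01) prior in JAGS has precision 0.01, which corresponds to a standard deviation of 10, so it is fairly vague:

```r
# JAGS parameterises the normal by precision tau = 1 / sigma^2
tau = 0.01
sigma = 1 / sqrt(tau)  # standard deviation 10

# Plot the implied prior density using R's sd parameterisation
curve(dnorm(x, mean = 0, sd = sigma), from = -30, to = 30,
      ylab = 'density', main = 'dnorm(0, 0.01) prior')
```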


What do the priors look like? Plot them up.


# The jags() function used below comes from the R2jags package
library(R2jags)

# Set up the data as a list object, and inspect it
model_data = list(T = T, y = y, x = x)
str(model_data)
## List of 3
##  $ T: num 50
##  $ y: num [1:50] -2.32 5.78 3.83 1.67 9.93 ...
##  $ x: num [1:50] 0.246 0.421 0.456 1.029 1.388 ...
# Choose the parameters to watch
model_parameters =  c("alpha","beta","sigma")

# Run the model
model_run = jags(data = model_data,
                 parameters.to.save = model_parameters,
                 model.file=textConnection(model_code),
                 n.chains=4, # Number of different starting positions
                 n.iter=1000, # Number of iterations
                 n.burnin=200, # Number of iterations to remove at start
                 n.thin=2) # Amount of thinning
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 50
##    Unobserved stochastic nodes: 3
##    Total graph size: 214
## 
## Initializing model

Simulated results

plot(model_run)

print(model_run)
## Inference for Bugs model at "5", fit using jags,
##  4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
##  n.sims = 1600 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat
## alpha      1.113   0.496   0.145   0.767   1.131   1.435   2.043 1.002
## beta       3.089   0.099   2.888   3.027   3.088   3.155   3.277 1.000
## sigma      2.985   0.326   2.430   2.751   2.951   3.187   3.692 1.003
## deviance 250.010   3.238 245.340 247.582 249.589 251.754 257.576 1.003
##          n.eff
## alpha     1400
## beta      1600
## sigma      730
## deviance   890
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.2 and DIC = 255.2
## DIC is an estimate of expected predictive error (lower deviance is better).
traceplot(model_run)

# Create a plot of the posterior mean regression line
post = model_run$BUGSoutput
alpha_mean = post$mean$alpha
beta_mean = post$mean$beta

plot(x, y)
lines(x, alpha_mean + beta_mean * x, col = 'red')
lines(x, alpha + beta * x, col = 'blue')
legend('topleft',
       legend = c('Truth', 'Posterior mean'),
       lty=1,
       col=c('blue','red'))

# Blue and red lines should be pretty close

  1. With the simulated data, experiment with changing the value of T when creating the data. What happens to the posterior distribution of the parameters as T gets bigger?
  2. Try experimenting with the priors. Suppose that you knew that beta was negative and used the prior beta ~ dunif(-2,-1). What happens to the posterior mean lines?

Some real data - Sea level rise

Church and White (2011) estimated sea level rise from the late 19th to the early 21st century, splicing tide gauge and satellite altimetry data together.

Load and have a look at the data set using the following code:

sea_level = read.csv('https://raw.githubusercontent.com/andrewcparnell/tsme_course/master/data/church_and_white_global_tide_gauge.csv')
head(sea_level)
##   year_AD sea_level_m age_error sea_level_error
## 1  1880.5   -0.159065       0.5          0.0242
## 2  1881.5   -0.153465       0.5          0.0242
## 3  1882.5   -0.170265       0.5          0.0230
## 4  1883.5   -0.164965       0.5          0.0228
## 5  1884.5   -0.144065       0.5          0.0222
## 6  1885.5   -0.145565       0.5          0.0219

  1. What is the rate of sea level rise for the whole data set? Ignore the errors for the moment and run a linear regression model.

  2. Plot the residuals through time. What do you notice? What simple alteration to the linear regression could you make to better capture the sea level rise?
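To illustrate the kind of pattern the residuals question is hinting at, here is a sketch on simulated data (not the real sea level series): a gently accelerating trend fitted with a straight line leaves curved residuals.

```r
set.seed(42)
year = 1880:2010
# Simulated 'sea level' with a mild quadratic (accelerating) trend plus noise
level = -0.15 + 0.0015 * (year - 1880) + 0.00001 * (year - 1880)^2 +
  rnorm(length(year), sd = 0.01)

fit = lm(level ~ year)
coef(fit)  # the slope is the average rise per year over the whole period

# The residuals show a systematic U-shape: a straight line misses the
# acceleration, suggesting e.g. adding a quadratic term to the regression
plot(year, resid(fit))
abline(h = 0, lty = 2)
```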


A logistic regression example

This time we give you less of the code: just the necessary parts.
library(boot) # Package contains the logit transform
# Maths -------------------------------------------------------------------

# Description of the Bayesian model fitted in this file
# Notation:
# y_t = binomial (often binary) response variable for observation t=1,...,N
# x_{1t} = first explanatory variable for observation t
# x_{2t} = second explanatory variable for observation t
# p_t = probability of y_t being 1 for observation t
# alpha = intercept term
# beta_1 = parameter value for explanatory variable 1
# beta_2 = parameter value for explanatory variable 2

# Likelihood
# y_t ~ Binomial(K,p_t), or Binomial(1,p_t) if binary
# logit(p_t) = alpha + beta_1 * x_1[t] + beta_2 * x_2[t]
# where logit(p_t) = log( p_t / (1 - p_t ))
# Note that p_t has to be between 0 and 1, but logit(p_t) has no limits

# Priors - all vague
# alpha ~ normal(0,100)
# beta_1 ~ normal(0,100)
# beta_2 ~ normal(0,100)
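The logit and its inverse are also available in base R as qlogis() and plogis() (equivalent to boot's logit() and inv.logit()); a quick sanity check of the transform described above:

```r
p = 0.3
logit_p = log(p / (1 - p))  # the logit by hand
qlogis(p)                   # same value via base R
plogis(logit_p)             # the inverse logit recovers p

# logit maps (0,1) to the whole real line; the inverse maps it back
plogis(c(-10, 0, 10))  # close to 0, exactly 0.5, close to 1
```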

# Simulate data -----------------------------------------------------------

# Some R code to simulate data from the above model
T = 100
set.seed(123)
x_1 = sort(runif(T,0,10))
x_2 = sort(runif(T,0,10))
alpha = 1
beta_1 = 0.2
beta_2 = -0.5
logit_p = alpha + beta_1 * x_1 + beta_2 * x_2
p = inv.logit(logit_p)
y = rbinom(T,1,p)

# Have a quick look at the effect of x_1 and x_2 on y
plot(x_1,y)

plot(x_2,y) # Clearly when x_2 is high, y tends to be 0

# Jags code ---------------------------------------------------------------

# Jags code to fit the model to the simulated data
model_code = '
model
{
  # Likelihood
  for (t in 1:T) {
    y[t] ~ dbin(p[t], K)
    logit(p[t]) <- alpha + beta_1 * x_1[t] + beta_2 * x_2[t]
  }

  # Priors
  alpha ~ dnorm(0.0,0.01)
  beta_1 ~ dnorm(0.0,0.01)
  beta_2 ~ dnorm(0.0,0.01)
}
'

# Set up the data
model_data = list(T = T, y = y, x_1 = x_1, x_2 = x_2, K = 1)

# Choose the parameters to watch
model_parameters =  c("alpha", "beta_1", "beta_2")

# Run the model
model_run = jags(data = model_data,
                 parameters.to.save = model_parameters,
                 model.file = textConnection(model_code),
                 n.chains = 4,
                 n.iter = 1000,
                 n.burnin = 200,
                 n.thin = 2)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 100
##    Unobserved stochastic nodes: 3
##    Total graph size: 711
## 
## Initializing model
# Check the output - are the true values inside the 95% CI?
# Also look at the R-hat values - they need to be close to 1 if convergence has been achieved
plot(model_run)

print(model_run)
## Inference for Bugs model at "6", fit using jags,
##  4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
##  n.sims = 1600 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat
## alpha      1.652   0.770   0.189   1.130   1.639   2.169   3.236 1.000
## beta_1     0.595   0.854  -1.120  -0.007   0.637   1.179   2.302 1.002
## beta_2    -1.018   0.949  -2.909  -1.671  -1.057  -0.341   0.861 1.001
## deviance 119.242   2.440 116.399 117.421 118.599 120.455 125.367 1.007
##          n.eff
## alpha     1600
## beta_1    1400
## beta_2    1500
## deviance   530
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 3.0 and DIC = 122.2
## DIC is an estimate of expected predictive error (lower deviance is better).
traceplot(model_run)

# Create a plot of the posterior mean regression line
post = model_run$BUGSoutput
alpha_mean = post$mean$alpha
beta_1_mean = post$mean$beta_1
beta_2_mean = post$mean$beta_2

# As we have two explanatory variables I'm going to create two plots
# holding one of the variables fixed whilst varying the other
par(mfrow = c(2, 1))
plot(x_1, y)
lines(x_1,
      inv.logit(alpha_mean + beta_1_mean * x_1 + beta_2_mean * mean(x_2)),
      col = 'red')
plot(x_2, y)
lines(x_2,
      inv.logit(alpha_mean + beta_1_mean * mean(x_1) + beta_2_mean * x_2),
      col = 'red')

# The line for x_1 should be increasing, and the line for x_2 decreasing

# Real example ------------------------------------------------------------

# Data wrangling and jags code to run the model on a real data set in the data directory

# Adapted data from Royle and Dorazio (Chapter 2)
# Moth mortality data
T = 12
K = 20
y = c(1,4,9,13,18,20, 0,2,6,10,12,16)
sex = c(rep('male',6), rep('female',6))
dose = rep(0:5, 2)
sexcode = as.integer(sex == 'male')
# The key question is: what are the effects of dose and sex?

# Set up the data
real_data = list(T = T, K = K, y = y, x_1 = sexcode, x_2 = dose)

# Run the model
real_data_run = jags(data = real_data,
                     parameters.to.save = model_parameters,
                     model.file = textConnection(model_code),
                     n.chains = 4,
                     n.iter = 1000,
                     n.burnin = 200,
                     n.thin = 2)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 12
##    Unobserved stochastic nodes: 3
##    Total graph size: 79
## 
## Initializing model
# Plot output
print(real_data_run)
## Inference for Bugs model at "7", fit using jags,
##  4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
##  n.sims = 1600 iterations saved
##          mu.vect sd.vect   2.5%    25%    50%    75%  97.5%  Rhat n.eff
## alpha     -3.360   0.515 -4.293 -3.685 -3.367 -3.059 -2.400 1.009   320
## beta_1     1.055   0.358  0.344  0.821  1.057  1.296  1.764 1.050   250
## beta_2     1.034   0.145  0.785  0.957  1.034  1.122  1.290 1.007   620
## deviance  40.210   5.821 37.052 38.039 39.093 40.774 47.123 1.008   980
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 16.9 and DIC = 57.1
## DIC is an estimate of expected predictive error (lower deviance is better).
# Create the same plot as before (only for dose though)
post = real_data_run$BUGSoutput
alpha_mean = post$mean$alpha
beta_1_mean = post$mean$beta_1
beta_2_mean = post$mean$beta_2

# Look at effect of sex - quantified by beta_1
hist(real_data_run$BUGSoutput$sims.list$beta_1, breaks = 30)
# Seems positive - males more likely to die

# What about effect of dose?
o = order(real_data$x_2)
par(mfrow=c(1,1)) # Reset plots

with(real_data,plot(x_2, y, pch = sexcode)) # Data
# Males
with(real_data,
     lines(x_2[o],
           K*inv.logit(alpha_mean + beta_1_mean * 1 + beta_2_mean * x_2[o]),
           col = 'red'))
# Females
with(real_data,
     lines(x_2[o],
           K*inv.logit(alpha_mean + beta_1_mean * 0 + beta_2_mean * x_2[o]),
           col = 'blue'))

# Legend
legend('topleft',
       legend = c('Males', 'Females'),
       lty = 1,
       col = c('red','blue'))
